IE 423 Project Part 1

Group 13

Alper Vural 2016402075

Ali Burak Kaya 2015402165

İsmet Onur Abdioğlu 2016402186

Ertuğrul Arda 2015402147

TASKS

A computer case is selected for surface analysis.

Part 1 - Question 1

In [7]:
library(jpeg)
surface <- readJPEG("resim.jpg")
str(surface)
 num [1:512, 1:512, 1:3] 0.529 0.576 0.592 0.463 0.353 ...

Part 1 - Question 2.a

Our image is stored in the surface variable. str(surface) shows that the image is defined by three 512x512 matrices. The readJPEG function uses the RGB color system, which distinguishes every pixel by the intensities of the primary colors red, green, and blue, so each color's intensity values are stored in a separate matrix.
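As a minimal sketch of this layout (using a synthetic 4x4x3 array in place of the actual image; the BT.601 luminance weights below are only an illustration, not something used elsewhere in this report):

```r
# Synthetic 4x4x3 array standing in for the 512x512x3 array from readJPEG.
set.seed(1)
img <- array(runif(4 * 4 * 3), dim = c(4, 4, 3))

r_chan <- img[, , 1]  # red intensities in [0, 1]
g_chan <- img[, , 2]  # green intensities
b_chan <- img[, , 3]  # blue intensities

# One common way to collapse the three channels into a single grayscale
# matrix: a luminance-weighted average (ITU-R BT.601 weights).
gray <- 0.299 * r_chan + 0.587 * g_chan + 0.114 * b_chan
dim(gray)    # a single 4x4 matrix
range(gray)  # values stay in [0, 1]
```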

In [52]:
surface_ras <- as.raster(surface)
par(mfrow=c(1,1))
plot(surface_ras)

Part 1 - Question 2.b

In [9]:
par(mfrow=c(3,1))
r<-apply(surface[,,1],2,rev)
g<-apply(surface[,,2],2,rev)
b<-apply(surface[,,3],2,rev)
image(1:ncol(r),1:nrow(r), as.matrix(t(r)), col=rgb(0:255,0,0,max=255),xlab="x",ylab="y",main="Red Channel")
image(1:ncol(g),1:nrow(g), as.matrix(t(g)), col=rgb(0,0:255,0,max=255),xlab="x",ylab="y",main="Green Channel")
image(1:ncol(b),1:nrow(b), as.matrix(t(b)), col=rgb(0,0,0:255,max=255),xlab="x",ylab="y",main="Blue Channel")

Part 1 - Question 3

In [10]:
par(mfrow=c(1,1))
colmean_R <- colMeans(r)
colmean_G <- colMeans(g)
colmean_B <- colMeans(b)
plot(colmean_R, main="3 Color Dimensions", ylab="", type="l", col="red",ylim=c(0,1))
lines(colmean_G,col="green")
lines(colmean_B,col="blue")
legend("topleft", c("R","G","B"), fill=c("red","green","blue"))

Part 1 - Question 4

In [12]:
r_top<-r[1:256,1:512]
r_bottom<-r[257:512,1:512]
r_diff<-abs(r_top-r_bottom)
diff_image_r<-as.raster(r_diff)
par(mfrow=c(2,1))
image(1:ncol(r),1:nrow(r), as.matrix(t(r)), col=rgb(0:255,0,0,max=255),xlab="x",ylab="y",main="Before Differentiation")
image(1:ncol(r_diff),1:nrow(r_diff), as.matrix(t(r_diff)), col=rgb(0:255,0,0,max=255),xlab="x",ylab="y",main="After Differentiation")
In [13]:
g_top<-g[1:256,1:512]
g_bottom<-g[257:512,1:512]
g_diff<-abs(g_top-g_bottom)
diff_image_g<-as.raster(g_diff)
par(mfrow=c(2,1))
image(1:ncol(g),1:nrow(g), as.matrix(t(g)), col=rgb(0,0:255,0,max=255),xlab="x",ylab="y",main="Before Differentiation")
image(1:ncol(g_diff),1:nrow(g_diff), as.matrix(t(g_diff)), col=rgb(0,0:255,0,max=255),xlab="x",ylab="y",main="After Differentiation")
In [14]:
b_top<-b[1:256,1:512]
b_bottom<-b[257:512,1:512]
b_diff<-abs(b_top-b_bottom)
diff_image_b<-as.raster(b_diff)
par(mfrow=c(2,1))
image(1:ncol(b),1:nrow(b), as.matrix(t(b)), col=rgb(0,0,0:255,max=255),xlab="x",ylab="y",main="Before Differentiation")
image(1:ncol(b_diff),1:nrow(b_diff), as.matrix(t(b_diff)), col=rgb(0,0,0:255,max=255),xlab="x",ylab="y",main="After Differentiation")

Our image is not homogeneous. The print on the case changes periodically, so when we subtract the bottom half from the top half, circular black shapes appear in the difference image. If the selected surface were homogeneous, the difference image would be entirely black.
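This visual check can be backed by a single number. A sketch, using a synthetic matrix in place of the red channel r: the mean absolute difference between the two halves, which would be near zero if the bottom half repeated the top half exactly.

```r
# Synthetic 512x512 matrix standing in for the red channel r.
set.seed(2)
r <- matrix(runif(512 * 512), 512, 512)

r_top    <- r[1:256, ]
r_bottom <- r[257:512, ]
r_diff   <- abs(r_top - r_bottom)

# Mean absolute difference: ~0 if the halves match, larger otherwise.
# For two independent uniform halves it is about 1/3.
mad_score <- mean(r_diff)
mad_score
```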

Part 1 - Question 5

In [30]:
library(EBImage)
par(mfrow=c(2,2))
size <- 5
mfiltered_surface_5 <- medianFilter(surface,size = size)
last_5 <- as.raster(mfiltered_surface_5)
size <- 11
mfiltered_surface_11 <- medianFilter(surface,size = size)
last_11 <- as.raster(mfiltered_surface_11)
size <- 31
mfiltered_surface_31 <- medianFilter(surface,size = size)
last_31 <- as.raster(mfiltered_surface_31)
plot(surface_ras)
text(x=270,y=28,labels=expression(bold("Original Image")))
plot(last_5)
text(x=270,y=28,labels=expression(bold("5x5 Filtered")))
plot(last_11)
text(x=270,y=28,labels=expression(bold("11x11 Filtered")))
plot(last_31)
text(x=270,y=28,labels=expression(bold("31x31 Filtered")))

We apply median filtering through the medianFilter function in the "EBImage" library. As expected, the variance of the pixel values of the original picture used in the previous questions decreases after filtering.

Through median filtering, in other words, we stabilize the picture and make it more useful for quality engineering purposes by decreasing the effect of outliers. Repeating the operation with a broader neighborhood, we observed that the method does not work as well as with the smaller neighborhood. The number of neighbors is up to the quality standards of the company, but when it is increased, the main shape features (the mesh structure in this part) are also lost, which cannot be tolerated: as the neighborhood grows, every pixel starts to look the same because the variance decreases. If we are seeking to catch small details in the image, a small window size should be selected. On the other hand, if we are working with a surface that is meant to be homogeneous, a larger window size should be selected in order to smooth the image.
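To make the window-size trade-off concrete, here is a base-R sketch of a median filter (a hand-written stand-in for EBImage::medianFilter, for illustration only) applied to synthetic noise with two window sizes:

```r
# (2k+1)x(2k+1) median filter, written out in base R for illustration.
median_filter <- function(m, k) {
  n <- nrow(m); p <- ncol(m)
  out <- m
  for (i in 1:n) {
    for (j in 1:p) {
      rows <- max(1, i - k):min(n, i + k)  # clamp the window at the borders
      cols <- max(1, j - k):min(p, j + k)
      out[i, j] <- median(m[rows, cols])
    }
  }
  out
}

set.seed(3)
noisy <- matrix(runif(32 * 32), 32, 32)
sd(noisy)                    # original variability
sd(median_filter(noisy, 1))  # 3x3 window: mild smoothing
sd(median_filter(noisy, 5))  # 11x11 window: much lower variance
```

The larger window drives the variance down harder, which is exactly why fine details such as the mesh structure disappear first.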

Part 2 - Question 1

In [37]:
library(extraDistr)
library(jpeg)
library(car)
grey<- readJPEG("grey.jpg")
str(grey)
grey_image<-grey[,,1]
par(mfrow=c(1,1))
hist(grey_image,main="Frequency of Pixel Values",xlab="Pixel Values",ylab="Frequency")
 num [1:512, 1:512, 1:3] 0.38 0.408 0.424 0.29 0.18 ...
In [38]:
getmode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}
a <- min(grey_image)
b <- max(grey_image)
c <- getmode(grey_image)
m <- mean(grey_image)
s <- sd(grey_image)
par(mfrow=c(1,1))
hist(rnorm(length(grey_image),m,s), main = "Original vs Normal vs Triangular",xlab = "Pixel Value",col = rgb(0,0,0.9,0.5),xlim=c(0,1))
hist(rtriang(length(grey_image),a,b,c),col=rgb(0.9,0,0,0.5),add=TRUE )
hist(grey_image,add=TRUE,col=rgb(0,0.9,0,0.5))
legend("topleft", c("Triangular","Original","Normal"), fill=c(rgb(0.9,0,0,0.5),rgb(0,0.9,0,0.5),rgb(0,0,0.9,0.5)))

First of all, we drew the histogram of the pixel values; the green histogram is the original data. Then we drew a normally distributed histogram using the sample mean and standard deviation as parameters. The mean of the population is 0.443658 and the standard deviation is 0.1849244; the blue histogram is the normal one.

Lastly, after seeing the difference between the normal histogram and the original one, we thought there could be a better distribution to describe the original data. We tried the triangular distribution. Its min, max, and mode were estimated; their values are respectively 0.01568627, 0.9137255, and 0.3960784.

The triangular distribution fits the data well, as we can see in the histogram.
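For reference, the fitted density can also be written out directly. A base-R sketch of the triangular density (equivalent in form to extraDistr::dtriang), using the estimates above; it could be overlaid on the histogram with curve:

```r
# Triangular density on [a, b] with mode c, written out in base R.
dtri <- function(x, a, b, c) {
  ifelse(x < a | x > b, 0,
         ifelse(x <= c,
                2 * (x - a) / ((b - a) * (c - a)),
                2 * (b - x) / ((b - a) * (b - c))))
}

a <- 0.01568627; b <- 0.9137255; c <- 0.3960784  # estimates from the data

dtri(a, a, b, c)  # density vanishes at the endpoints
dtri(c, a, b, c)  # and peaks at the mode, where it equals 2 / (b - a)
# Overlay on a density-scale histogram: curve(dtri(x, a, b, c), add = TRUE)
```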

Part 2 - Question 2

In [54]:
set.seed(366)
x <- sample(grey_image,5000)
set.seed(366)
ks.test(x,rtriang(length(x),min(x),max(x),getmode(x)))
Warning message in ks.test(x, rtriang(length(x), min(x), max(x), getmode(x))):
"p-value will be approximate in the presence of ties"
	Two-sample Kolmogorov-Smirnov test

data:  x and rtriang(length(x), min(x), max(x), getmode(x))
D = 0.019, p-value = 0.3275
alternative hypothesis: two-sided

We estimated the parameters from the data. For the triangular distribution, the sample length, min, max, and mode are needed.

Their values are respectively 5000, 0.01568627, 0.9137255, 0.3960784.

After that, we applied the Kolmogorov-Smirnov test and got a p-value of 0.3275. We cannot say that the original data does not follow a triangular distribution: we fail to reject the null hypothesis, so from here on the data is assumed to be triangularly distributed.
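The ties warning arises because JPEG pixel values are quantized, so repeated values occur. For a continuous sample, the test can also be run in one-sample form against the triangular CDF directly. A sketch: ptri below is a base-R stand-in for extraDistr::ptriang, and the sample is synthetic (drawn by inverse-CDF sampling with illustrative parameters), not the actual image data.

```r
# Triangular CDF on [a, b] with mode c (stand-in for extraDistr::ptriang).
ptri <- function(q, a, b, c) {
  ifelse(q < a, 0,
         ifelse(q <= c, (q - a)^2 / ((b - a) * (c - a)),
                ifelse(q <= b, 1 - (b - q)^2 / ((b - a) * (b - c)), 1)))
}

set.seed(366)
n <- 5000; a <- 0; b <- 1; c <- 0.4  # illustrative parameters
u <- runif(n)
# Inverse-CDF sampling from the triangular distribution.
x <- ifelse(u < (c - a) / (b - a),
            a + sqrt(u * (b - a) * (c - a)),
            b - sqrt((1 - u) * (b - a) * (b - c)))

# One-sample KS test against the fitted CDF (no random reference sample).
ks.test(x, ptri, a = a, b = b, c = c)
```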

In [73]:
min <- min(grey_image)
max <- max(grey_image)
mode <- getmode(grey_image)
cat("max:",max," ")

cat("min:",min," ")

cat("mode:",mode)
max: 0.9137255  min: 0.01568627  mode: 0.3960784

Estimated parameters are as following: max: 0.9137255 min: 0.01568627 mode: 0.3960784

Part 2 - Question 3

In [46]:
alpha <- 0.001
lower <- qtriang(alpha,a,b,c)#qnorm(alpha,m,s)
upper <- qtriang(1-alpha,a,b,c)#qnorm(1-alpha,m,s)
grey_image_out <- grey_image
number_out <- as.integer(0)
for(i in 1:nrow(grey_image)){
  for(j in 1:ncol(grey_image)){
    if(grey_image[i,j] > upper | grey_image[i,j] < lower){
      grey_image_out[i,j] <- 0
      number_out <- number_out +1
    }
  }
}
number_out
par(mfrow=c(1,2))
plot(as.raster(grey_image),main= "Original")
text(x=230,y=600,labels=expression(bold("Original")))
plot(as.raster(grey_image_out),main="Without Outliers")
text(x=230,y=600,labels=expression(bold("Without Outliers")))
77

The alpha value is 0.001, as given in the question. By passing the parameter values to qtriang, we found the upper and lower limits; the function works like qnorm. We then created a variable called number_out to count the total number of outliers.

Two nested for loops search the matrix in two dimensions. If a pixel value is higher than the upper limit or lower than the lower limit, number_out increases by one and that pixel's value is set to 0, which makes it black.

The final number_out value is 77. Having only 77 outliers among 262,144 pixels means the original image fits the triangular distribution well, so there is only a limited number of outliers in the data. When both images are displayed, the difference is hard to observe.
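The double loop above can also be written as one vectorized comparison, since R compares whole matrices elementwise. A sketch, with a synthetic matrix and illustrative limits in place of grey_image and the qtriang bounds:

```r
# Synthetic matrix and limits standing in for grey_image and qtriang bounds.
set.seed(4)
grey_image <- matrix(runif(512 * 512), 512, 512)
lower <- 0.001; upper <- 0.999

out_mask <- grey_image < lower | grey_image > upper  # TRUE at outlier pixels
number_out <- sum(out_mask)                          # total outlier count
grey_image_out <- grey_image
grey_image_out[out_mask] <- 0                        # blacken the outliers
number_out
```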

Part 2 - Question 4

In [51]:
alpha <- 0.001
grey_image_out_patch <- grey_image
number_out_patch <- as.integer(0)
lower_patch <- matrix(0,10,10)
upper_patch <- matrix(0,10,10)

# Estimate triangular limits separately for each 51x51 patch
# (a 10x10 grid covering rows/columns 1:510 of the 512x512 image).
for(i in 1:10){
  for(j in 1:10){
    min_p<- min(grey_image[((i-1)*51+1):(i*51),((j-1)*51+1):(j*51)])
    max_p<- max(grey_image[((i-1)*51+1):(i*51),((j-1)*51+1):(j*51)])
    mode_p<- getmode(grey_image[((i-1)*51+1):(i*51),((j-1)*51+1):(j*51)])
    lower_patch[i,j] <- qtriang(alpha,a=min_p,b=max_p,c=mode_p)
    upper_patch[i,j] <- qtriang(1-alpha,a=min_p,b=max_p,c=mode_p)
  }
}
# Flag each pixel that falls outside its own patch's limits.
for(i in 1:509){
  for(j in 1:509){
    upper_p <- upper_patch[floor(i/51)+1,floor(j/51)+1]
    lower_p <- lower_patch[floor(i/51)+1,floor(j/51)+1]
    if(grey_image[i,j] > upper_p | grey_image[i,j] < lower_p){
      grey_image_out_patch[i,j] <- 0
      number_out_patch <- number_out_patch +1
    }
  }
}

number_out_patch
par(mfrow=c(1,2))
plot(as.raster(grey_image),main= "Original")
text(x=230,y=600,labels=expression(bold("Original")))
plot(as.raster(grey_image_out_patch),main="Without Outliers")
text(x=230,y=600,labels=expression(bold("Without Outliers")))
968

In the last part of the project, we performed essentially the same operations on smaller patches of the image (i.e., smaller samples). The number of outliers was much higher than for the full image: when the sample size decreases, the estimates of the distribution parameters, and hence the control limits, become more variable, and noisier limits naturally flag a greater number of outlier pixels. The black points are easy to spot, since there are more outliers than in the previous question. Outliers appear mainly around the colour shifts: wherever there is a transition from a darker colour to a lighter one or vice versa, outliers can be seen.
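The sample-size effect can be illustrated with a small simulation (synthetic normal data, unrelated to the image itself): the extreme sample quantile that plays the role of a control limit is far more variable when estimated from a patch-sized sample than from a full-image-sized one.

```r
# Spread (across repeated samples) of the estimated 0.999 sample quantile,
# i.e. the upper control limit, as a function of the sample size.
set.seed(5)
sd_of_limit <- function(n, reps = 200) {
  sd(replicate(reps, quantile(rnorm(n), 0.999)))
}

sd_of_limit(51 * 51)    # patch-sized sample: noisy limit
sd_of_limit(512 * 512)  # full-image-sized sample: much more stable limit
```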

In terms of quality control, working with smaller patches can suppress the outlier effects better, but the control scope may not be large enough to detect a problematic texture. As in the previous part, the choice between these options depends on the quality policy of the company.